Poster: EbbeHove Date: Dec 25, 2015 8:55am
Forum: texts Subject: Re: Quality control of scanned books

Hi kristinmak

Looking at the number of uploads, I cannot help thinking that archive.org is slowly drowning in irrelevant material. Imagine visiting a physical archive where the items you pick from the shelves have no proper titles or descriptions and turn out to be anything but texts. Imagine if some of the items were black boxes that you had to put your hand into to find out what is inside (meaning downloading unknown content). Welcome to archive.org. Your complaint is understood and shared, kristinmak.

What can be done? Well, this organization with only 140 employees surely needs help, and I do not mean money or volunteers to "welcome and seat guests". I mean editors: volunteer editors who can categorize material and quarantine or delete the stuff that is illegal or just spam.

Ebbe

Poster: Jeff Kaplan Date: Jan 2, 2016 12:47pm
Forum: texts Subject: Re: Quality control of scanned books

hi,

there is a beta flagging system in place on each details page. if you feel an item should be flagged for review, feel free to do just that. it may take a bit of time to sort through all flagged items, since the system is not yet fully operational, but marking the items for review is a first step.

Poster: EbbeHove Date: Jan 3, 2016 11:27pm
Forum: texts Subject: Re: Quality control of scanned books

Hello Mr. Kaplan
Well, flagging content really just seems like barking at the moon.
Why not let your regular users do something more productive?
For instance, since December 25th, the Internet Archive has received 43 "Text" items with Language=Danish. The problem is that not one of them is in Danish. Most of them are (as far as I can see) music files or bits of software, and some are videos. I do not want to flag them and wait and wait. I want to be able to remove the faulty language field and the faulty text field so that future searches by anyone will not be polluted by this material. But the Internet Archive will not allow this. Do you have any plans to reconsider your position on external editors?
Ebbe Hove

Poster: xensyria Date: Jan 7, 2016 6:00am
Forum: texts Subject: Re: Quality control of scanned books

Firstly, IA is great, and while I sometimes share the frustration mentioned above, the number of times it's helped without problem massively outweighs this.

I agree that IA has the potential for at least as strong a community as, say, Wikimedia Commons, if the platform allowed for it.

Are there any consequences that we haven't thought of that might make them reluctant to do so? (For example, might IA become more copyright-conscious along the lines of Wikimedia, rather than keeping its current, more live-and-let-live approach, which, incidentally, seems more in line with current DMCA legislation than Wikimedia's?)

Poster: EbbeHove Date: Jan 9, 2016 12:37am
Forum: texts Subject: Re: Quality control of scanned books

Hello xensyria
Glad we agree on the need for a community.
To embrace the idea, IA could start with a small group of volunteer editors with limited editing rights, just to see that it works. Personally I would be happy to be able to edit the "language" parameter, to add and remove "topics" and to reclassify material (to move items erroneously marked as "text"). I see no need for the right to delete material. This is just about cleaning up categories so that searches do not get cluttered by loads of irrelevant files.
Let us hope IA has the courage to trust their users.
Ebbe

Poster: xensyria Date: Jan 9, 2016 3:46pm
Forum: texts Subject: Re: Quality control of scanned books

I would love to see a strong community here, and there are intermediary steps as you suggest, as well as ways to manage the risks of incompetent or rogue editors. Edits could either be stored as a list waiting to go live (pending changes), or go live immediately and be checked afterwards (patrolling). It would basically be the current flagging system, but without the person checking the flag having to do the work!

But I've been thinking a bit more about the IA's perspective. I guess community curation isn't part of their vision of an information archive; the resource itself looks to be the focus.

The model of uploaders adding the info as they go (some of them very prolific and, I think, with more powers over things like collections) has worked to get it to where it is.

There do seem to be tentative moves towards this to address the problems you mention (like the flagging system), but it does seem to be just that: trying to solve a problem rather than embracing an opportunity.

Perhaps there's also a bottleneck there in terms of resources: there might not be the funds to divert to redesign the platform to enable a community to flourish.

I'm not sure we can influence them to go for this either, other than by showing that we're here if they ever decide to go down this route. Would be great to hear any official thoughts on it though!

This post was modified by xensyria on 2016-01-09 23:46:00